A POTENTIAL EFFECTOR FOR THE GRB7 FAMILY OF SIGNALLING PROTEINS

Field of the Invention: 

The present invention relates to a novel polynucleotide molecule 
encoding a candidate effector protein for the Grb7 family of signalling 
proteins. Detection of the encoded protein in a tissue sample should provide 
a useful tumour marker and/or prognostic indicator. Furthermore, 
antagonism of the interaction between Grb7 family members and the 
encoded protein should provide a novel treatment strategy for human 
diseases exhibiting aberrant receptor tyrosine kinase (RTK) signalling (e.g. 
cancer). 

Background of the Invention 

RTKs play a major role in the regulation of cellular growth, 
differentiation, motility and metabolism by converting an extracellular signal 
in the form of the binding of a specific hormone or growth factor to the 
activation of specific signalling pathways and hence modes of intracellular 
communication (Schlessinger and Ullrich, Neuron 9, 383-391, 1992).
Activation of RTKs results in both autophosphorylation of the receptor and 
the phosphorylation of downstream targets on tyrosine residues. It has 
become evident over the last decade that key elements in receptor-substrate 
and other protein-protein interactions in RTK signalling are src homology 
(SH)2 domains. SH2 domains are conserved modules of approximately 100
amino acids found in a wide variety of signalling molecules which bind to 
short tyrosine-phosphorylated peptide sequences. The specificity of 
interaction is determined both by the nature of the amino acids flanking the 
phosphotyrosine residue in the target peptide and residues in the SH2 
domain which interact with these sites (Pawson, Nature 373, 573-580, 1995). 

SH2-domain containing proteins can be divided into two classes: those 
which possess a catalytic function (e.g. the cytoplasmic tyrosine kinase c-src 
and the tyrosine phosphatase SH-PTP2) and those which consist entirely of 
non-catalytic protein domains (e.g. Grb2), the adaptor sub-class. The function
of the latter class is to link separate catalytic subunits to a tyrosine-
phosphorylated receptor or signalling intermediate, and other non-catalytic
protein modules are often involved in these interactions. For example, SH3
and WW domains (conserved regions of approximately 50 and 40 amino
acids, respectively) bind proline-rich peptide ligands, and pleckstrin
homology domains (approximately 100 amino acids) interact with both
specific phospholipid and protein targets (Pawson, 1995 supra).

The Grb7 family represents a family of SH2 domain-containing
adaptors which currently contains three members: Grb7, 10 and 14 (Margolis
et al, Proc. Natl. Acad. Sci. USA 89, 8894-8898, 1992; Stein et al, EMBO J. 13,
1331-1340, 1994; Ooi et al, Oncogene 10, 1621-1630, 1995; Daly et al, J. Biol.
Chem. 271, 12502-12510, 1996). These proteins share a common overall
architecture, consisting of an N-terminal region containing a highly
conserved proline-rich decapeptide motif, a central region harbouring a PH
domain and a C-terminal SH2 domain. The central region of approximately
300 amino acids bears significant homology to the C. elegans protein mig10,
which is required for long range neuronal migration in embryos; otherwise
the Grb7 family and mig10 are structurally distinct. However, the family
members exhibit differences in both SH2 selectivity towards RTKs (Janes et
al, J. Biol. Chem. 272, 8490-8497, 1997) and tissue distribution. The family
has therefore evolved to link particular receptors to downstream effectors
in a tissue-specific manner. Interestingly, the genes encoding this family
appear to have co-segregated with ERBB family genes during evolution. Thus
GRB7, 10 and 14 are linked to ERBB2, ERBB1 (epidermal growth factor
receptor) and ERBB4, respectively (Stein et al, 1994 supra; Ooi et al, 1995
supra; Baker et al, Genomics 36, 218-220, 1996). The juxtaposition of GRB7
and ERBB2 leads to common co-amplification in human breast cancers and,
since the two gene products are functionally linked, likely up-regulation
of an undefined erbB2 signalling pathway. Furthermore, GRB14 also exhibits
differential expression in human breast cancers (Daly et al, 1996 supra).
These two proteins may therefore modulate RTK signalling in this disease.

In order to identify proteins which bind to this family and therefore
identify candidate effectors, we performed a genetic screen using the yeast
two hybrid system and a Grb14 "bait". This application describes the cloning
and characterization of a novel interacting protein, currently designated
2.2412.

Disclosure of the Invention:

Thus, in a first aspect, the present invention provides an isolated
polynucleotide molecule encoding a candidate effector protein for the Grb7
family of signalling proteins, wherein the polynucleotide molecule comprises
a nucleotide sequence having at least 75% sequence identity to that shown as
SEQ ID NO: 1.
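
By way of illustration only, the sequence identity threshold of the first aspect can be sketched in a few lines of Python. The specification does not state how identity is to be calculated, so the definition used here (matching positions divided by aligned positions at which neither sequence has a gap) is an assumption, as is the requirement that the two sequences have already been aligned. The variant sequence is hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns where neither sequence has a gap.

    Assumes the two sequences were aligned beforehand ('-' marks a gap).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be pre-aligned to equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # gap columns are ignored under this definition
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


reference = "attcctcttcataatgcatg"  # first 20 nucleotides of SEQ ID NO: 1
variant = "attcctcttcataacgcatg"    # hypothetical variant with one substitution
identity = percent_identity(reference, variant)
meets_first_aspect = identity >= 75.0  # threshold recited in the first aspect
```

Other definitions of identity (for example, counting gap columns in the denominator) would give different values for gapped alignments.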

Preferably, the polynucleotide molecule comprises a nucleotide
sequence having at least 85%, more preferably at least 95%, sequence
identity to that shown as SEQ ID NO: 1. Most preferably, the polynucleotide
molecule comprises a nucleotide sequence encoding a polypeptide
comprising an amino acid sequence substantially corresponding to that
shown as SEQ ID NO: 2.

In a preferred embodiment of the invention of the first aspect, the
polynucleotide molecule comprises a nucleotide sequence which
substantially corresponds to that shown as SEQ ID NO: 1.

The polynucleotide molecule may be a dominant negative mutant
which encodes a gene product causing an altered phenotype by, for example,
reducing or eliminating the activity of endogenous effector proteins of the
Grb7 family of signalling proteins.

The polynucleotide molecule may be incorporated into plasmids or
expression vectors (including viral vectors), which may then be introduced
into suitable host cells such as bacterial, yeast, insect and mammalian host
cells. Such host cells may be used to express the protein encoded by the
polynucleotide molecule.

Accordingly, in a second aspect, the present invention provides a host 
cell transformed with the polynucleotide molecule of the first aspect. 

In a third aspect, the present invention provides a method of producing
a protein, comprising culturing the host cell of the second aspect under
conditions suitable for the expression of the polynucleotide molecule and
optionally recovering the protein.

Preferably, the host cell is mammalian or of insect origin. Where the
cell is mammalian, it is presently preferred that it be a Chinese hamster
ovary (CHO) cell or human embryonic kidney (HEK) 293 cell. Where the
host cell is of insect origin, it is presently preferred that it be an insect Sf9
cell.

In a fourth aspect, the present invention provides a purified protein
encoded by the polynucleotide molecule of the first aspect.

In a preferred embodiment of this aspect, the purified protein
comprises an amino acid sequence substantially corresponding to that shown
as SEQ ID NO: 2.

In a fifth aspect, the present invention provides a fusion protein 
comprising an amino acid sequence substantially corresponding to that 
shown as SEQ ID NO: 2. 

Fusion proteins according to the fifth aspect may include an N-
terminal fragment of a protein such as β-galactosidase to assist in the
expression and selection of host cells expressing the candidate effector
protein, or may include a functional fragment of any other suitable protein
to confer additional activity(ies).

In a sixth aspect, the present invention provides an antibody or 
fragment thereof which specifically binds to the protein of the fourth aspect. 

The antibody may be monoclonal or polyclonal; however, it is
presently preferred that the antibody is a monoclonal antibody. Suitable
antibody fragments include Fab, F(ab')2 and scFv.

In a seventh aspect, the present invention provides an oligonucleotide 
probe comprising a nucleotide sequence of at least 12 nucleotides, the 
oligonucleotide probe comprising a nucleotide sequence such that the 
oligonucleotide probe selectively hybridises to the polynucleotide molecule 
of the first aspect under high stringency conditions (Sambrook et al.,
Molecular Cloning: a Laboratory Manual, Second Edition, Cold Spring Harbor 
Laboratory Press). 

In a preferred embodiment of this aspect, the oligonucleotide probe is 
labelled. In a further preferred embodiment of this aspect, the 
oligonucleotide probe comprises a nucleotide sequence of at least 18 
nucleotides. 
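
As a rough illustration of the length limits recited for the probe (at least 12, preferably at least 18 nucleotides), the following Python sketch builds an antisense probe against the first 18 nucleotides of SEQ ID NO: 1 and checks it for perfect complementarity. Treating selective hybridisation as an exact-match test is a simplifying assumption; actual behaviour under high stringency conditions depends on temperature, salt concentration and probe composition, none of which is modelled here. The function names are illustrative only.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def probe_matches(probe: str, target: str, min_length: int = 12) -> bool:
    """True if the probe meets the minimum length and is perfectly
    complementary to some stretch of the target strand.

    Exact complementarity is used here as a crude stand-in for selective
    hybridisation; it is not a model of stringency.
    """
    probe = probe.upper()
    return len(probe) >= min_length and reverse_complement(probe) in target.upper()


target = "attcctcttcataatgcatgctcttttggt"  # first 30 nucleotides of SEQ ID NO: 1
probe = reverse_complement(target[:18])     # 18-mer antisense probe
```

Calling `probe_matches(probe, target, min_length=18)` applies the length of the further preferred embodiment.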

In an eighth aspect, the present invention provides a method of
detecting in a sample the presence of an effector protein for the Grb7 family
of proteins, the method comprising reacting the sample with an antibody or
fragment thereof of the sixth aspect, and detecting the binding of the
antibody or fragment thereof.

The method of the eighth aspect may be conducted using any
immunoassay well known in the art (e.g. ELISA). The sample may be, for
example, a cell lysate or homogenate prepared from a tissue biopsy.

In a ninth aspect, the present invention provides a method of detecting
in a sample the presence of mRNA encoding an effector protein for the Grb7
family of proteins, the method comprising reacting the sample with an
oligonucleotide probe of the seventh aspect, and detecting the binding of the
probe.

The method of the ninth aspect may be conducted using any
hybridisation assay well known in the art (e.g. Northern blot). The sample
may be a poly(A) RNA preparation or homogenate prepared from a tissue
biopsy.

Grb7 family proteins exhibit differential expression in certain human
cancers (particularly breast and prostate cancer) and may therefore be
involved in tumour progression. Detection of the protein encoded by the
cDNA 2.2412 in a sample should provide a useful tumour marker and/or
prognostic indicator for these cancers. Furthermore, the interaction of Grb7
family members with 2.2412 may provide a novel target for therapeutic
intervention.

It is to be understood that methods of detecting suitable agonists and
methods of therapy utilising detected agonists also form part of the present
invention.

The term "substantially corresponds" as used herein in relation to the
nucleotide sequence shown as SEQ ID NO: 1 is intended to encompass minor
variations in the nucleotide sequence which, due to degeneracy in the
genetic code, do not result in a change in the encoded protein. Further, this
term is intended to encompass other minor variations in the sequence which
may be required to enhance expression in a particular system but in which
the variations do not result in a decrease in biological activity of the
encoded protein.

The term "substantially corresponding" as used herein in relation to the
amino acid sequence shown as SEQ ID NO: 2 is intended to encompass
minor variations in the amino acid sequence which do not result in a
decrease in biological activity of the protein. These variations may include
conservative amino acid substitutions. The substitutions envisaged are:
G, A, V, I, L, M; D, E; N, Q; S, T; K, R, H; F, Y, W, H; and
P, Nα-alkylamino acids.
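
The substitution groups listed above lend themselves to a simple lookup. The Python sketch below is illustrative only: it encodes the groups in single-letter code and tests whether a given replacement falls within a shared group. Nα-alkylamino acids have no single-letter code, so that group is represented here by proline alone, which is a simplification. The function name is hypothetical.

```python
# Conservative substitution groups as listed in the specification.
# Histidine (H) appears in two groups, as in the text.
GROUPS = [
    set("GAVILM"),
    set("DE"),
    set("NQ"),
    set("ST"),
    set("KRH"),
    set("FYWH"),
    set("P"),  # Nα-alkylamino acids are not representable in single-letter code
]


def is_conservative(original: str, replacement: str) -> bool:
    """True if both residues are different members of at least one group."""
    original = original.upper()
    replacement = replacement.upper()
    if original == replacement:
        return False  # identical residues are not a substitution
    return any(original in group and replacement in group for group in GROUPS)
```

For example, `is_conservative("L", "I")` is true, whereas `is_conservative("D", "K")` is false because aspartate and lysine share no group.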

The terms "comprise", "comprises" and "comprising" as used 
throughout the specification are intended to refer to the inclusion of a stated 
step, component or feature or group of steps, components or features with or
without the inclusion of a further step, component or feature or group of 
steps, components or features. 

The invention will hereinafter be described with reference to the
accompanying figures and the following non-limiting example.

Brief description of the accompanying figures:

Figure 1 provides the nucleotide and amino acid (single letter code) 
sequence of 2.2412. Numbers refer to distances in base pairs. Ankyrin-type repeat 
sequences are underlined. An additional repeat sequence is indicated by italics. 
The stop codon is represented by an asterisk. The original cDNA clone 2.2412 
isolated by the two hybrid screen spans nucleotides 694-2664 of this sequence. 

Figure 2 provides a map of the 2.2412-binding region on Grb14.

A. Structure of the deletion constructs used in the analysis. Gal4 DNA-BD fusion
constructs encoding full length Grb14 (FL), the N-terminal (N), central region (C)
and N-terminal + central region (N + C) were generated in the vector pAS2.1.

B. Results of β-galactosidase activity assays following transformation of the above
plasmids into yeast strain Y190 together with the original 2.2412 cDNA clone in
pACT-2.

Example: CLONING AND CHARACTERISATION OF 2.2412 

Yeast two hybrid screen 

The yeast two hybrid system exploits protein-protein interactions to
reconstitute a functional transcriptional activator which can then be detected
using a gene reporter system (Fields and Sternglanz, TIG 10, 286-292, 1994).
The technique takes advantage of the properties of the Gal4 protein of the
yeast S. cerevisiae. The Gal4 DNA binding domain (DNA-BD) or activation
domain (AD) alone are incapable of inducing transcription. However, an
interaction between two proteins synthesized as DNA-BD- and AD-fusions,
respectively, brings the Gal4 domains into close proximity and results in
transcriptional activation of two reporter genes (HIS3 and LacZ) which can be
monitored by growth on selective medium and biochemical assays.

A plasmid construct encoding a Gal4 DNA-BD-Grb14 fusion was
generated as follows. The plasmid GRB14/pRcCMVF containing full length
GRB14 cDNA (Daly et al, 1996) was restricted with HindIII and Klenow
treated to create blunt ends, and then digested with BclI to release three
fragments of approximately 1.1, 4.2 and 1.7 kb. The 1.7 kb fragment was
isolated and cloned into the NdeI (Klenow treated) and BamHI sites of the
yeast expression vector pAS2.1 (Clontech) to generate GRB14/pAS2.1
containing an in-frame fusion of full length Grb14 with the GAL4 DNA-BD.
This construct was introduced by electroporation into the yeast strain
CG1945 (MATa, ura3-52, his3-200, ade2-101, lys2-801, trp1-901, leu2-3, 112,
gal4-542, gal80-538, cyhr2, LYS2::GAL1UAS-GAL1TATA-HIS3,
URA3::GAL4 17mers(x3)-CYC1TATA-lacZ) selecting for tryptophan
prototrophy. The expression of the fusion protein was verified by Western
blot analysis with antibodies directed against the Flag epitope and the Gal4
DNA-BD. The recipient strain was then grown to mid-log phase and a human
liver cDNA library in the vector pACT2 (Clontech) introduced using the LiAc
procedure (Schiestl and Gietz, Curr. Genet. 16, 339-346, 1989). Transformants
were then selected for tryptophan, leucine and histidine prototrophy in the
presence of 5 mM 3-aminotriazole.

From a screen of 1 x 10^6 clones, 39 colonies were initially selected on
synthetic complete (SC)-leu-his-trp + 3AT medium and were then tested for
β-galactosidase activity. Twelve clones scored positive in the latter assay and
were subjected to cycloheximide (CHX) curing to remove the bait plasmid by
streaking out on SC-leu media containing 10 µg/ml CHX (pAS2-1 contains the
CYH2 gene which restores CHX sensitivity to CG1945 cells). This enabled
confirmation of the bait dependency of LacZ activation and subsequent
isolation of the pACT2 plasmids encoding interacting proteins by standard
methodology (Philippsen et al, Methods in Enzymology 194, 170-177). Back
transformations were then performed in which these pACT2 plasmids were
introduced into CG1945 strains containing the bait plasmid (GRB14/pAS2-1)
or constructs encoding non-related Gal4 DNA-BD fusions in order to confirm
the specificity of the interactions.

The DNA sequences of the cDNA inserts were then obtained by cycle
sequencing (f-mol kit, Promega) using pACT2-specific and/or clone-specific
primers. Based on their nucleotide sequences the 12 interacting clones were
classified into 6 independent groups (see Table I).

TABLE I: Characterization of cDNA clones isolated by the yeast two
hybrid screen.

Class  No. of   Identity               Mean RLU               Colour intensity
       clones                          (Liquid assay)         (Filter assay)

1      6        Nedd4                  2.86 x 10^[illegible]  + + + +
2      2        Htk                    1.86 x 10^[illegible]
3      1        2.2412                 5.18 x 10^[illegible]  + + + +
4      1        Proteasome             3.88 x 10^2            +/-
5      1        Somatostatin           1.45 x 10^[illegible]  +/-
                receptor
6      1        L-arginine:glycine     8.61 x 10^2            +/-
                amidinotransferase

The 12 clones exhibiting activation of both the HIS3 and LacZ reporter genes were divided
into 6 groups by sequence analysis of their cDNA inserts. Results of β-galactosidase activity assays
performed using two methodologies are shown. The liquid assay (Galacto-Light,
TROPIX) is more quantitative: results are given in mean relative light units (RLU) and are normalized
for the protein content of the samples. Blue/white screening of the cDNA clones was also performed
using a colony lift filter assay (Clontech). The intensity of blue colour development over
approximately 2 h is scored from +/- (very weak) to + + + + (strong).

Six clones were partial cDNAs corresponding to Nedd4, a multidomain
protein containing a calcium-dependent phospholipid binding (CaLB)
domain, four WW domains and a C-terminal region homologous to the E6-AP
carboxyl-terminus (Kumar et al, Biochem. Biophys. Res. Commun. 185, 1155-
1161, 1992; Sudol et al, J. Biol. Chem. 270, 14733-14741, 1995; Huibregtse et al,
Proc. Natl. Acad. Sci. USA 92, 2563-2567, 1995). The latter is likely to confer
E3 ubiquitin-protein ligase activity on Nedd4. The pACT2 clones isolated
encoded the CaLB domain together with the first 22 amino acids of the first
WW domain.

Two clones encoded the intracellular region and part of the
extracellular domain of Htk, which is a RTK of the Eph family (Bennett et al,
J. Biol. Chem. 269, 14211-14218, 1994). The recruitment of Grb14 by Htk is of
interest for two reasons. First, the expression profiles of both Htk and the
murine homologue myk-1 are indicative of a potential role in mammary
gland development and neoplasia (Andres et al, Oncogene 9, 1461-1467, 1994;
Berclaz et al, Biochem. Biophys. Res. Comm. 226, 869-875, 1996). Second, Eph
family members may be involved in the regulation of cell migration (Tessier-
Lavigne, Cell 82, 345-348, 1995), which is intriguing given the homology of
the Grb7 family to the C. elegans protein mig10 (Stein et al, 1994 supra).

A novel cDNA of 1971 bp, designated 2.2412, was also isolated. This clone
encoded a polypeptide of 657 amino acids in frame with the Gal4 DNA-BD. The
cDNA did not contain a stop codon, and this, together with the Northern analysis
described below, indicated that it was incomplete. This DNA fragment was
therefore used as a probe to screen a human placental cDNA library (5'-STRETCH
PLUS, Clontech, in λgt10). This resulted in the isolation of two clones, designated
clone 8 and clone 12. Clone 8 was approximately 2 kb and overlapped the original
2.2412 clone by 900 bp at the 3' end. This clone provided the carboxy-terminal
end of the 2.2412 protein sequence (Figure 1). Clone 12 was approximately 3.5 kb
and to date has provided an additional 692 bp of sequence information in the 5'
direction. The nucleotide and protein sequence for 2.2412 provided by these
overlapping clones is shown in Figure 1. Since a 5' initiation codon has yet to be
identified the coding sequence still appears to be incomplete.
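
The relationship between the nucleotide and protein sequences can be checked with a short translation sketch. The Python below is illustrative only: it assumes the standard genetic code and reading frame 1, translates the first 60 nucleotides of SEQ ID NO: 1, and confirms that the stated span of the original clone (nucleotides 694-2664) corresponds to 1971 bp, i.e. 657 complete codons. The helper names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate DNA in frame 1; codons with ambiguous bases become 'X'."""
    dna = dna.upper()
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


# First 60 nucleotides of SEQ ID NO: 1.
first_60 = "attcctcttcataatgcatgctcttttggtcatgctgaagtagtcaatctccttttgcga"
peptide = translate(first_60)  # single-letter code for residues 1-20

# Span of the original two hybrid clone within the Figure 1 sequence.
clone_start, clone_end = 694, 2664
clone_length = clone_end - clone_start + 1
codons, remainder = divmod(clone_length, 3)
```

The translated peptide corresponds to the first 20 residues of SEQ ID NO: 2 (Ile Pro Leu His Asn Ala Cys Ser Phe Gly His Ala Glu Val Val Asn Leu Leu Leu Arg), and the clone length is an exact multiple of three, consistent with an open reading frame lacking a stop codon.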

Further characterization of 2.2412

Database searches using the 2.2412 cDNA sequence revealed
significant homology with a large number of proteins containing ankyrin-like
repeats. These sequences were first identified as homologous regions
between certain cell cycle regulatory proteins and the Drosophila protein
Notch (Breeden and Nasmyth, Nature 329, 651-654, 1987) but subsequently
they have been identified in a wide variety of other proteins where they are
thought to function in protein-protein interactions (Bork, Proteins 17, 363-
374, 1993). Subsequent analysis of the protein sequence identified 18
consecutive ankyrin repeats and an additional repetitive element (Figure 1).
The ankyrin repeat region is followed by a stretch of approximately 40 amino
acids rich in serine residues. The remaining C-terminal region has a
relatively high content of charged amino acids.

Northern analysis of 2.2412 mRNA expression

Analysis of multiple tissue Northern blots (Clontech) was
performed using the original 2.2412 cDNA as a probe. This resulted in the 
detection of a single mRNA transcript of approximately 7 kb in all tissues 
examined with the exception of the kidney. Expression was particularly high 
in skeletal muscle and placenta. The size of this transcript compared to that 
of the 2.2412 clone indicates that the latter represents only a partial cDNA. 

Genomic localization of the 2.2412 gene 

Fluorescence in situ hybridization of the original 2.2412 cDNA to 
normal metaphases (Baker et al, 1996 supra) and reference to the FRA10A 
fragile site at 10q23.32 localized the gene to between chromosome 10q23.2 
and proximal 10q23.32. Interestingly, deletions in the 10q22-25 region of 
chromosome 10 have been detected in a variety of human cancers including 
breast, prostate, renal, small cell lung and endometrial carcinomas, 
glioblastoma multiforme, melanoma and meningiomas, suggesting the 
presence of one or more tumour suppressive loci in this region (Li et al, 
Science 275, 1943-1947, 1997; Steck et al, Nature Genetics 15, 356-362, 1997,
and references therein). Two candidate tumour suppressor genes have been
identified in this region (MMAC1/PTEN and MXI1; Li et al, 1997 supra; Steck
et al, 1997 supra; Albarosa et al, Hum. Genet. 95, 709-711, 1995).

Analysis of the interaction between 2.2412 and Grb7 family members 

cDNAs encoding the full length and N- and C-terminal regions of the
original 2.2412 cDNA clone (nucleotides 694-2664, 694-1614 and 1615-2664
of the sequence shown in Figure 1, respectively) were cloned into the vector
pGEX4T2 (Pharmacia). The full length construct was generated by
subcloning from the pACT2 clone as an NdeI fragment, whereas the shorter
constructs were synthesized by directional cloning of PCR products. The
corresponding GST-fusion proteins were purified from IPTG-induced
bacterial cultures using glutathione-agarose beads (Smith and Johnson, Gene
67, 31-40, 1988). These immobilized fusion proteins were then incubated
with lysates from cells expressing Flag epitope-tagged Grb14 (Daly et al, 1996
supra) or human breast cancer cells expressing high levels of Grb7 (SK-BR-3;
Stein et al, 1994) as described previously (Daly et al, 1996). Following
washing, bound proteins were detected by Western blot analysis. The results
indicated that 2.2412 bound specifically to both Grb14 and Grb7 in vitro, and
that the N-terminal fusion protein bound more strongly than that derived
from the C-terminus. These data, obtained using a different methodology for
detecting protein-protein interactions from the yeast two hybrid system,
confirm that 2.2412 interacts with Grb14. Furthermore, 2.2412 also binds Grb7.
Consequently 2.2412 appears to represent a general effector for the Grb7
family.

Mapping of the 2.2412 binding region on Grb14

In order to identify the region of Grb14 that interacts with 2.2412, a
series of Grb14 deletion mutants were generated by cloning PCR fragments
synthesized using the appropriate flanking primers into the vector pAS2.1.
These fragments spanned the following regions: N-terminus ("N", amino acids
1-110), the central region ("C") encompassing the mig10 homology and the
"between PH and SH2" (BPS) domain (amino acids 110-437) and the N-
terminal and central regions ("N + C", amino acids 1-437). These plasmids
were individually transformed into the yeast strain Y190 (MATa, ura3-52,
his3-200, ade2-101, lys2-801, trp1-901, leu2-3, 112, gal4Δ, gal80Δ, cyhr2,
LYS2::GAL1UAS-HIS3TATA-HIS3, URA3::GAL1UAS-GAL1TATA-lacZ) and
expression of the appropriately sized Gal4 DNA-BD fusion proteins
confirmed by Western blotting. Following transformation of the resulting
yeast strains with the original 2.2412 cDNA clone in pACT-2, the strength of
the interaction was determined by either liquid- or filter-based β-
galactosidase assays. The results are presented in Figure 2, and demonstrate
that the N-terminal region of Grb14 is not only required, but is also
sufficient for binding 2.2412. This supports the hypothesis that 2.2412
represents a general effector for the Grb7 family, since the N-terminal region
of these proteins contains a highly conserved proline-rich motif which may
mediate this interaction.
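
The reasoning behind "required" and "sufficient" in the mapping experiment can be expressed as a small sketch. The Python below is illustrative only. The construct names follow Figure 2, but the binding outcomes are assumptions chosen to be consistent with the conclusion stated in the text; the measured values appear only in Figure 2 and are not reproduced here.

```python
# Regions present in each Gal4 DNA-BD fusion construct (names as in Figure 2).
constructs = {
    "FL": {"N", "C", "SH2"},
    "N": {"N"},
    "C": {"C"},
    "N+C": {"N", "C"},
}

# Assumed qualitative outcomes, consistent with the stated conclusion;
# these are NOT the measured data of Figure 2.
binds = {"FL": True, "N": True, "C": False, "N+C": True}


def is_required(region: str) -> bool:
    """A region is required if every construct that binds contains it."""
    return all(region in regions for name, regions in constructs.items() if binds[name])


def is_sufficient(region: str) -> bool:
    """A region is sufficient if a construct containing only that region binds."""
    return any(regions == {region} and binds[name] for name, regions in constructs.items())
```

Under these assumed outcomes, `is_required("N")` and `is_sufficient("N")` are both true, while neither holds for the central region.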

It will be appreciated by persons skilled in the art that numerous
variations and/or modifications may be made to the invention as shown in
the specific embodiments without departing from the spirit or scope of the
invention as broadly described. The present embodiments are, therefore, to
be considered in all respects as illustrative and not restrictive.



WO 99/15647    PCT/AU98/00795



Sequence listings:

SEQUENCE LISTING

Applicant: Garvan Institute of Medical Research

Title of Invention: A potential effector for the Grb7 family of signalling
proteins.

Current Application Number:
Current Filing Date:

Prior Application Number: P09388
Prior Application Filing Date: 1997-09-23

Number of SEQ ID NOs: 2

Software: PatentIn Ver. 2.0

SEQ ID NO: 1
Length: 3400
Type: DNA
Organism: Homo sapiens

Sequence: 1

attcctcttc ataatgcatg ctcttttggt catgctgaag tagtcaatct ccttttgcga 60
catggtgcag accccaatgc tcgagataat tggaattata ctcctctcca tgaagctgca 120
attaaaggaa agattgatgt ttgcattgtg ctgttacagc atggagctga gccaaccatc 180
cgaaatacag atggaaggac agcattggat ttagcagatc catctgccaa agcagtgctt 240
actggtgaat ataagaaaga tgaactctta gaaagtgcca ggagtggcaa tgaagaaaaa 300
atgatggctc tactcacacc attaaatgtc aactgccacg caagtgatgg cagaaagtca 360
actccattac atttggcagc aggatataac agagtaaaga ttgtacagct gttactgcaa 420
catggacgtg atgtccatgc taaagataaa ggtgatctgg taccattaca caatgcctgt 480
tcttatggtc attatgaagt aactgaactt ttggtcaagc atggtggctg tgtaaatgca 540
atggacttgt ggcaattcac tcctcttcat gaggcagctt ctaagaacag ggttgaagta 600
tgttctcttc tcttaagtta tggtgcagac ccaacactgc tcaattgtaa gaataaaagt 660
gctatagact tggctcccac accacagtta aaagaaagat tagcatatga atttaaaggc 720
cactcgttgc tgcaagctgc acgagaagct gatgttactc gaatcaaaaa acatctctct 780
ctggaaatgg tgaatttcaa gcatcctcaa acacatgaaa cagcattgca ttgtgctgct 840
gcatctccat atcccaaaag aaagcaaata tgtgaactgt tgctaagaaa aggagcaaac 900
atcaatgaaa agactaaaga attcttgact cctctgcacg tggcatctga gaaagctcat 960
aatgatgttg ttgaagtagt ggtgaaacat gaagcaaagg ttaatgctct ggataatctt 1020
ggtcagactt ctctacacag agctgcatat tgtggtcatc tacaaacctg ccgcctactc 1080
ctgagctatg ggtgtgatcc taacattata tcccttcagg gctttactgc tttacagatg 1140
ggaaatgaaa atgtacagca actcctccaa gagggtatct cattaggtaa ttcagaggca 1200
gacagacaat tgctggaagc tgcaaaggct ggagatgtcg aaactgtaaa aaaactgtgt 1260
actgttcaga gtgtcaactg cagagacatt gaagggcgtc agtctacacc acttcatttt 1320
gcagctgggt ataacagagt gtccgtggtg gaatatctgc tacagcatgg agctgatgtg 1380
catgctaaag ataaaggagg ccttgtacct ttgcacaatg catgttctta cggacattat 1440
gaagttgcag aacttcttgt taaacatgga gcagtagtta atgtagctga tttatggaaa 1500
tttacacctt tacatgaagc agcagcaaaa ggaaaatatg aaatttgcaa acttctgctc 1560
cagcatggtg cagaccctac aaaaaaaaac agggatggaa atactccttt ggatcttgtt 1620
aaagatggag atacagatat tcaagatctg cttaggggag atgcagcttt gctagatgct 1680
gccaagaagg gttgtttagc cagagtgaag aagttgtctt ctcctgataa tgtaaattgc 1740
cgcgataccc aaggcagaca ttcaacacct ttacatttag cagctggtta taataattta 1800
gaagttgcag agtatttgtt acaacacgga gctgatgtga atgcccaaga caaaggagga 1860
cttattcctt tacataatgc agcatcttac gggcatgtag atgtagcagc tctactaata 1920
aagtataatg catctctcaa tgccacggac aaatgggctt tcacaccttt gcacgaagca 1980
gcccaaaagg gacgaacaca gctttgtgct ttgttgctag cccatggagc tgacccgact 2040
cttaaaaatc aggaaggaca aacaccttta gatttagttt cagcagatga tgtcagcgct 2100
cttctgacag cagccatgcc cccatctgct ctgccctctt gttacaagcc tcaagtgctc 2160
aatggtgtga gaagcccagg agccactgca gatgctctct cttcaggtnc atctagccca 2220
tcaagccttt ctgcagccag cagtcttgac aacttatctg ggagtttttc agaactgtct 2280
tcagtagtta gttcaagtgg aacagagggt gcttccagtt tggagaaaaa ggaggttcca 2340
ggagtagatt ttagcataac tcaattcgta aggaatcttg gacttgagca cctaatggat 2400
atatttgaga gagaacagat cactttggat gtattagttg agatggggca caaggagctg 2460
aaggagattg gaatcaatgc ttatggacat aggcacaaac taattaaagg agtcgagaga 2520
cttatctccg gacaacaagg tcttaaccca tatttaactt tgaacacctc tggtagtgga 2580
acaattctta tagatctgtc tcctgatgat aaagagtttc agtctgtgga ggaagagatg 2640
caaagtacag ttcgagagca cagagatgga ggtcatgcag gtggaatctt caacagatac 2700
aatattctca agattcagaa ggtttgtaac aagaaactat gggaaagata cactcaccgg 2760
agaaaagaag tttctgaaga aaaccacaac catgccaatg aacgaatgct atttcatggg 2820
tctccttttg tgaatgcaat tatccacaaa ggctttgatg aaaggcatgc gtacataggt 2880
ggtatgtttg gagctggcat ttattttgct gaaaactctt ccaaaagcaa tcaatatgta 2940
tatggaattg gaggaggtac tgggtgtcca gttcacaaag acagatcttg ttacatttgc 3000
cacaggcagc tgctcttttg ccgggtaacc ttgggaaagt ctttcctgca gttcagtgca 3060
atgaaaatgg cacattctcc tccaggtcat cactcagtca ctggtaggcc cagtgtaaat 3120
ggcctagcat tagctgaata tgttatttac agaggagaac aggcttatcc tgagtattta 3180
attacttacc agattatgag gcctgaaggt atggtcgatg gataaatagt tattttaaga 3240
aactaattcc actgaaccta aaatcatcaa agcagcagtg gcctctacgt tttactcctt 3300
tgctgaaaaa aaatcatctt gcccacaggc ctgtggcaaa aggataaaaa tgtgaacgaa 3360
gtttaacatt ctgacttgat aaagctttaa taatgtacag 3400

SEQ ID NO: 2
Length: 1074
Type: PRT
Organism: Homo sapiens

Sequence: 2

Ile Pro Leu His Asn Ala Cys Ser Phe Gly His Ala Glu Val Val Asn
1 5 10 15

Leu Leu Leu Arg His Gly Ala Asp Pro Asn Ala Arg Asp Asn Trp Asn
20 25 30

Tyr Thr Pro Leu His Glu Ala Ala Ile Lys Gly Lys Ile Asp Val Cys
35 40 45

Ile Val Leu Leu Gln His Gly Ala Glu Pro Thr Ile Arg Asn Thr Asp
50 55 60

Gly Arg Thr Ala Leu Asp Leu Ala Asp Pro Ser Ala Lys Ala Val Leu
65 70 75 80

Thr Gly Glu Tyr Lys Lys Asp Glu Leu Leu Glu Ser Ala Arg Ser Gly
85 90 95

Asn Glu Glu Lys Met Met Ala Leu Leu Thr Pro Leu Asn Val Asn Cys
100 105 110

His Ala Ser Asp Gly Arg Lys Ser Thr Pro Leu His Leu Ala Ala Gly
115 120 125

Tyr Asn Arg Val Lys Ile Val Gln Leu Leu Leu Gln His Gly Arg Asp
130 135 140



Val His Ala Lys Asp Lys Gly Asp Leu Val Pro Leu His Asn Ala Cys 
145 150 155 160 

Ser Tyr Gly His Tyr Glu Val Thr Glu Leu Leu Val Lys His Gly Gly 
165 170 175 

Cys Val Asn Ala Met Asp Leu Trp Gln Phe Thr Pro Leu His Glu Ala 
180 185 190 

Ala Ser Lys Asn Arg Val Glu Val Cys Ser Leu Leu Leu Ser Tyr Gly 
195 200 205 

Ala Asp Pro Thr Leu Leu Asn Cys Lys Asn Lys Ser Ala Ile Asp Leu 
210 215 220 

Ala Pro Thr Pro Gln Leu Lys Glu Arg Leu Ala Tyr Glu Phe Lys Gly 
225 230 235 240 

His Ser Leu Leu Gln Ala Ala Arg Glu Ala Asp Val Thr Arg Ile Lys 
245 250 255 

Lys His Leu Ser Leu Glu Met Val Asn Phe Lys His Pro Gln Thr His 
260 265 270 

Glu Thr Ala Leu His Cys Ala Ala Ala Ser Pro Tyr Pro Lys Arg Lys 
275 280 285 

Gln Ile Cys Glu Leu Leu Leu Arg Lys Gly Ala Asn Ile Asn Glu Lys 
290 295 300 

Thr Lys Glu Phe Leu Thr Pro Leu His Val Ala Ser Glu Lys Ala His 
305 310 315 320 

Asn Asp Val Val Glu Val Val Val Lys His Glu Ala Lys Val Asn Ala 
325 330 335 

Leu Asp Asn Leu Gly Gln Thr Ser Leu His Arg Ala Ala Tyr Cys Gly 
340 345 350 

His Leu Gln Thr Cys Arg Leu Leu Leu Ser Tyr Gly Cys Asp Pro Asn 
355 360 365 

Ile Ile Ser Leu Gln Gly Phe Thr Ala Leu Gln Met Gly Asn Glu Asn 
370 375 380 

Val Gln Gln Leu Leu Gln Glu Gly Ile Ser Leu Gly Asn Ser Glu Ala 
385 390 395 400 

Asp Arg Gln Leu Leu Glu Ala Ala Lys Ala Gly Asp Val Glu Thr Val 
405 410 415 

Lys Lys Leu Cys Thr Val Gln Ser Val Asn Cys Arg Asp Ile Glu Gly 
420 425 430 

Arg Gln Ser Thr Pro Leu His Phe Ala Ala Gly Tyr Asn Arg Val Ser 
435 440 445 

Val Val Glu Tyr Leu Leu Gln His Gly Ala Asp Val His Ala Lys Asp 
450 455 460 

Lys Gly Gly Leu Val Pro Leu His Asn Ala Cys Ser Tyr Gly His Tyr 
465 470 475 480 

Glu Val Ala Glu Leu Leu Val Lys His Gly Ala Val Val Asn Val Ala 
485 490 495 

Asp Leu Trp Lys Phe Thr Pro Leu His Glu Ala Ala Ala Lys Gly Lys 
500 505 510 

Tyr Glu Ile Cys Lys Leu Leu Leu Gln His Gly Ala Asp Pro Thr Lys 
515 520 525 

Lys Asn Arg Asp Gly Asn Thr Pro Leu Asp Leu Val Lys Asp Gly Asp 
530 535 540 

Thr Asp Ile Gln Asp Leu Leu Arg Gly Asp Ala Ala Leu Leu Asp Ala 
545 550 555 560 

Ala Lys Lys Gly Cys Leu Ala Arg Val Lys Lys Leu Ser Ser Pro Asp 
565 570 575 

Asn Val Asn Cys Arg Asp Thr Gln Gly Arg His Ser Thr Pro Leu His 
580 585 590 

Leu Ala Ala Gly Tyr Asn Asn Leu Glu Val Ala Glu Tyr Leu Leu Gln 
595 600 605 

His Gly Ala Asp Val Asn Ala Gln Asp Lys Gly Gly Leu Ile Pro Leu 
610 615 620 

His Asn Ala Ala Ser Tyr Gly His Val Asp Val Ala Ala Leu Leu Ile 
625 630 635 640 

Lys Tyr Asn Ala Ser Leu Asn Ala Thr Asp Lys Trp Ala Phe Thr Pro 
645 650 655 

Leu His Glu Ala Ala Gln Lys Gly Arg Thr Gln Leu Cys Ala Leu Leu 
660 665 670 

Leu Ala His Gly Ala Asp Pro Thr Leu Lys Asn Gln Glu Gly Gln Thr 
675 680 685 

Pro Leu Asp Leu Val Ser Ala Asp Asp Val Ser Ala Leu Leu Thr Ala 
690 695 700 

Ala Met Pro Pro Ser Ala Leu Pro Ser Cys Tyr Lys Pro Gln Val Leu 
705 710 715 720 

Asn Gly Val Arg Ser Pro Gly Ala Thr Ala Asp Ala Leu Ser Ser Gly 
725 730 735 

Pro Ser Ser Pro Ser Ser Leu Ser Ala Ala Ser Ser Leu Asp Asn Leu 
740 745 750 

Ser Gly Ser Phe Ser Glu Leu Ser Ser Val Val Ser Ser Ser Gly Thr 
755 760 765 

Glu Gly Ala Ser Ser Leu Glu Lys Lys Glu Val Pro Gly Val Asp Phe 
770 775 780 

Ser Ile Thr Gln Phe Val Arg Asn Leu Gly Leu Glu His Leu Met Asp 
785 790 795 800 

Ile Phe Glu Arg Glu Gln Ile Thr Leu Asp Val Leu Val Glu Met Gly 
805 810 815 

His Lys Glu Leu Lys Glu Ile Gly Ile Asn Ala Tyr Gly His Arg His 
820 825 830 

Lys Leu Ile Lys Gly Val Glu Arg Leu Ile Ser Gly Gln Gln Gly Leu 
835 840 845 

Asn Pro Tyr Leu Thr Leu Asn Thr Ser Gly Ser Gly Thr Ile Leu Ile 
850 855 860 

Asp Leu Ser Pro Asp Asp Lys Glu Phe Gln Ser Val Glu Glu Glu Met 
865 870 875 880 

Gln Ser Thr Val Arg Glu His Arg Asp Gly Gly His Ala Gly Gly Ile 
885 890 895 

Phe Asn Arg Tyr Asn Ile Leu Lys Ile Gln Lys Val Cys Asn Lys Lys 
900 905 910 

Leu Trp Glu Arg Tyr Thr His Arg Arg Lys Glu Val Ser Glu Glu Asn 
915 920 925 

His Asn His Ala Asn Glu Arg Met Leu Phe His Gly Ser Pro Phe Val 
930 935 940 

Asn Ala Ile Ile His Lys Gly Phe Asp Glu Arg His Ala Tyr Ile Gly 
945 950 955 960 

Gly Met Phe Gly Ala Gly Ile Tyr Phe Ala Glu Asn Ser Ser Lys Ser 
965 970 975 

Asn Gln Tyr Val Tyr Gly Ile Gly Gly Gly Thr Gly Cys Pro Val His 
980 985 990 

Lys Asp Arg Ser Cys Tyr Ile Cys His Arg Gln Leu Leu Phe Cys Arg 
995 1000 1005 

Val Thr Leu Gly Lys Ser Phe Leu Gln Phe Ser Ala Met Lys Met Ala 
1010 1015 1020 

His Ser Pro Pro Gly His His Ser Val Thr Gly Arg Pro Ser Val Asn 
1025 1030 1035 1040 

Gly Leu Ala Leu Ala Glu Tyr Val Ile Tyr Arg Gly Glu Gln Ala Tyr 
1045 1050 1055 

Pro Glu Tyr Leu Ile Thr Tyr Gln Ile Met Arg Pro Glu Gly Met Val 
1060 1065 1070 

Asp Gly 
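
For comparison with the one-letter translation shown in Figure 1, the three-letter residue codes of the listing above can be converted to one-letter codes. The following Python sketch is illustrative only and forms no part of the original listing. The function name `to_one_letter` is hypothetical, and the mapping is the standard amino acid code table.

```python
# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def to_one_letter(listing_row: str) -> str:
    """Convert a whitespace-separated row of three-letter codes (hypothetical helper)."""
    return "".join(THREE_TO_ONE[code] for code in listing_row.split())


# Residues 1 to 16 of SEQ ID NO: 2 as given in the listing.
first_row = "Ile Pro Leu His Asn Ala Cys Ser Phe Gly His Ala Glu Val Val Asn"
print(to_one_letter(first_row))
```

The printed string can be compared directly with the start of the first translation line of Figure 1.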
Claims: 

1. An isolated polynucleotide molecule encoding a candidate effector 
protein for the Grb7 family of signalling proteins, wherein the polynucleotide 
molecule comprises a nucleotide sequence having at least 75% sequence 
identity to that shown as SEQ ID NO: 1. 

2. A polynucleotide molecule according to claim 1, wherein the 
polynucleotide molecule comprises a nucleotide sequence having at least 
85% sequence identity to that shown as SEQ ID NO: 1. 

3. A polynucleotide molecule according to claim 1, wherein the 
polynucleotide molecule comprises a nucleotide sequence having at least 
95% sequence identity to that shown as SEQ ID NO: 1. 

4. A polynucleotide molecule according to claim 1, wherein the 
polynucleotide molecule comprises a nucleotide sequence which 
substantially corresponds to that shown as SEQ ID NO: 1. 

5. A host cell transformed with a polynucleotide molecule according to 
any one of the preceding claims. 

6. A host cell according to claim 5, wherein the host cell is a mammalian, 
insect, yeast or bacterial host cell. 

7. A method of producing a protein, comprising culturing the host cell of 
claim 5 or 6 under conditions suitable for the expression of the 
polynucleotide molecule and optionally recovering the protein. 

8. A purified protein encoded by a polynucleotide molecule according to 
any one of claims 1 to 4. 

9. A purified protein according to claim 8, wherein the protein comprises 
an amino acid sequence substantially corresponding to that shown as SEQ ID 
NO: 2. 



10. A fusion protein comprising an amino acid sequence substantially 
corresponding to that shown as SEQ ID NO: 2. 

11. An antibody or fragment thereof which specifically binds to a protein 
according to claim 8 or 9. 

12. An oligonucleotide probe comprising a nucleotide sequence of at least 
12 nucleotides, the oligonucleotide probe comprising a nucleotide sequence 
such that the oligonucleotide probe selectively hybridises to the 
polynucleotide molecule of any one of claims 1 to 4 under high stringency 
conditions. 

13. An oligonucleotide probe according to claim 12, wherein the 
oligonucleotide probe comprises a nucleotide sequence of at least 18 
nucleotides. 

14. A method of detecting in a sample the presence of an effector protein 
for the Grb7 family of proteins, the method comprising reacting the sample 
with an antibody or fragment thereof according to claim 11. 


15. A method of detecting in a sample the presence of mRNA encoding an 
effector protein for the Grb7 family of proteins, the method comprising 
reacting the sample with an oligonucleotide probe of claim 12 or 13. 
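
The identity thresholds recited in claims 1 to 3 can be illustrated with a simple calculation. The sketch below is a minimal example only: it assumes two pre-aligned sequences of equal length and counts matching positions, whereas the claims do not prescribe a particular method and practical comparisons would normally use a gapped alignment program. The function name `percent_identity` and the variant sequence are hypothetical; the reference fragment is the first 20 nucleotides shown in Figure 1.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of identical positions in two pre-aligned, equal-length sequences."""
    if not seq_a or len(seq_a) != len(seq_b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


# First 20 nucleotides of the sequence shown in Figure 1.
reference = "ATTCCTCTTCATAATGCATG"
# Hypothetical variant carrying two substitutions (positions 3 and 10).
variant = "ATCCCTCTTTATAATGCATG"

identity = percent_identity(reference, variant)
print(f"{identity:.1f}% identity")
print("meets 75% threshold:", identity >= 75.0)
print("meets 95% threshold:", identity >= 95.0)
```

Under these assumptions the variant would satisfy the 75% and 85% thresholds but not the 95% threshold.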
FIGURE 1 

ATTCCTCTTCATAATGCATGCTCTTTTGGTCATGCTGAAGTAGTCAATCTCCTTTTGCGACATGGTGCAG 70 
IPLHNACSFGHAEVVNLLLRHGA 

accccaatcx:tcgagataattc^ttatactc^^ 1 4 0 

DPNARDNWNYTPLHEAAIKGKIDV 

ttgcattgtgctgttacagcatggagctgagccaacca^^ 210 

CIVLLQHGAEPTIRNTDGRTALD 

TTAGCAGATCCATCTGCCAAAGCAGTGCTTACTGGTGAATATAAGAAAGATGAACTCTTAGAAAGTGCCA 280 
LADPSAKAVLTGEYKKDELLESA 

GGAGTGGCAATGAAGAAAAAATGATGGCTCTACTCACACCATTAAATGTCAACTGCCACGCAAGTGATGG 350 
RSGNEEKMMALLTPLNVNCHASDG 

CAGAAAGTCAACTCCATTACATTTGGCAGCAGGATATAACAGAGTAAAGATTGTACAGCTGTTACTGCAA 420 
RKSTPLHLAAGYNRVKIVQLLLQ 

CATGGACGTGATGTCCATGCTAAAGATAAAGGTGATCTGGTACCATTACACAATGCCTGTTCTTATGGTC 490 
HGRDVHAKDKGDLVPLHNACSYG 

ATTATGAAGTAACTGAACTTTTGGTCAAGCATGGTGGCTGTGTAAATGCAATGGACTTGTGGCAATTCAC 560 
HYEVTELLVKHGGCVNAMDLWQFT 

TCCTCTTCATGAGGCAGCTTCTAAGAACAGGGTTGAAGTATGTTCTCTTCTCTTAAGTTATGGTGCAGAC 630 
PLHEAASKNRVEVCSLLLSYGAD 

CCAACACTGCTCAATTGTAAGAATAAAAGTGCTATAGACTTGGCTCCCACACCACAGTTAAAAGAAAGAT 700 
PTLLNCKNKSAIDLAPTPQLKER 

TAGCATATGAATTTAAAGGCCACTCGTTGCTGCAAGCTGCACGAGAAGCTGATGTTACTCGAATCAAAAA 770 
LAYEFKGHSLLQAAREADVTRIKK 

ACATCTCTCTCTGGAAATGGTGAATTTCAAGCATCCTCAAACACATGAAACAGCATTGCATTGTGCTGCT 840 
HLSLEMVNFKHPQTHETALHCAA 



GCATCTCCATATCCCAAAAGAAAGCAAATATGTGAACTGTTGCTAAGAAAAGGAGCAAACATCAATGAAA 910 
ASPYPKRKQICELLLRKGANINE 

AGACTAAAGAATTCTTGACTCCTCTGCACGTGGCATCTGAGAAAGCTCATAATGATGTTGTTGAAGTAGT 980 
KTKEFLTPLHVASEKAHNDVVEVV 

GGTGAAACATGAAGCAAAGGTTAATGCTCTGGATAATCTTGGTCAGACTTCTCTACACAGAGCTGCATAT 1050 
VKHEAKVNALDNLGQTSLHRAAY 

TGTGGTCATCTACAAACCTGCCGCCTACTCCTGAGCTATGGGTGTGATCCTAACATTATATCCCTTCAGG 1120 
CGHLQTCRLLLSYGCDPNIISLQ 

GCTTTACTGCTTTACAGATGGGAAATGAAAATGTACAGCAACTCCTCCAAGAGGGTATCTCATTAGGTAA 1190 
GFTALQMGNENVQQLLQEGISLGN 

TTCAGAGGCAGACAGACAATTGCTGGAAGCTGCAAAGGCTGGAGATGTCGAAACTGTAAAAAAACTGTGT 1260 
SEADRQLLEAAKAGDVETVKKLC 

ACTGTTCAGAGTGTCAACTGCAGAGACATTGAAGGGCGTCAGTCTACACCACTTCATTTTGCAGCTGGGT 1330 
TVQSVNCRDIEGRQSTPLHFAAG 

ATAACAGAGTGTCCGTGGTGGAATATCTGCTACAGCATGGAGCTGATGTGCATGCTAAAGATAAAGGAGG 1400 
YNRVSVVEYLLQHGADVHAKDKGG 

CCTTGTACCTTTGCACAATGCATGTTCTTACGGACATTATGAAGTTGCAGAACTTCTTGTTAAACATGGA 1470 
LVPLHNACSYGHYEVAELLVKHG 

GCAGTAGTTAATGTAGCTGATTTATGGAAATTTACACCTTTACATGAAGCAGCAGCAAAAGGAAAATATG 1540 
AVVNVADLWKFTPLHEAAAKGKY 

AAATTTGCAAACTTCTGCTCCAGCATGGTGCAGACCCTACAAAAAAAAACAGGGATGGAAATACTCCTTT 1610 
EICKLLLQHGADPTKKNRDGNTPL 

GGATCTTGTTAAAGATGGAGATACAGATATTCAAGATCTGCTTAGGGGAGATGCAGCTTTGCTAGATGCT 1680 
DLVKDGDTDIQDLLRGDAALLDA 

GCCAAGAAGGGTTGTTTAGCCAGAGTGAAGAAGTTGTCTTCTCCTGATAATGTAAATTGCCGCGATACCC 1750 
AKKGCLARVKKLSSPDNVNCRDT 

AAGGCAGACATTCAACACCTTTACATTTAGCAGCTGGTTATAATAATTTAGAAGTTGCAGAGTATTTGTT 1820 
QGRHSTPLHLAAGYNNLEVAEYLL 

ACAACACGGAGCTGATGTGAATGCCCAAGACAAAGGAGGACTTATTCCTTTACATAATGCAGCATCTTAC 1890 
QHGADVNAQDKGGLIPLHNAASY 

ggkatgtagatgtao^tctactaataaagta™ 1 960 

GHVDVAALLIKYNASLNATDKWA 

TCACACCTTTGCACGAAGCAGCCCAAAAGGGACGAACACAGCTTTGTGCTTTGTTGCTAGCCCATGGAGC 2030 
FTPLHEAAQKGRTQLCALLLAHGA 

TGACCCGACTCTTAAAAATCAGGAAGGACAAACACCTTTAGATTTAGTTTCAGCAGATGATGTCAGCGCT 2100 
DPTLKNQEGQTPLDLVSADDVSA 

CTTCTGACAGCAGCCATGCCCCCATCTGCTCTGCCCTCTTGTTACAAGCCTCAAGTGCTCAATGGTGTGA 2170 
LLTAAMPPSALPSCYKPQVLNGV 

GAAGCCCAGGAGCCACTGCAGATGCTCTCTCTTCAGGTCCATCTAGCCCATCAAGCCTTTCTGCAGCCAG 2240 
RSPGATADALSSGPSSPSSLSAAS 

CAGTCTTGACAACTTATCTGGGAGTTTTTCAGAACTGTCTTCAGTAGTTAGTTCAAGTGGAACAGAGGGT 2310 
SLDNLSGSFSELSSVVSSSGTEG 

GCTTCCAGTTTGGAGAAAAAGGAGGTTCCAGGAGTAGATTTTAGCATAACTCAATTCGTAAGGAATCTTG 2380 
ASSLEKKEVPGVDFSITQFVRNL 

GACTTGAGCACCTAATGGATATATTTGAGAGAGAACAGATCACTTTGGATGTATTAGTTGAGATGGGGCA 2450 
GLEHLMDIFEREQITLDVLVEMGH 

CAAGGAGCTGAAGGAGATTGGAATCAATGCTTATGGACATAGGCACAAACTAATTAAAGGAGTCGAGAGA 2520 
KELKEIGINAYGHRHKLIKGVER 

CTTATCTCCGGACAACAAGGTCTTAACCCATATTTAACTTTGAACACCTCTGGTAGTGGAACAATTCTTA 2590 
LISGQQGLNPYLTLNTSGSGTIL 

TAGATCTGTCTCCTGATGATAAAGAGTTTCAGTCTGTGGAGGAAGAGATGCAAAGTACAGTTCGAGAGCA 2660 
IDLSPDDKEFQSVEEEMQSTVREH 

CAGAGATGGAGGTCATGCAGGTGGAATCTTCAACAGATACAATATTCTCAAGATTCAGAAGGTTTGTAAC 2730 
RDGGHAGGIFNRYNILKIQKVCN 

AAGAAACTATGGGAAAGATACACTCACCGGAGAAAAGAAGTTTCTGAAGAAAACCACAACCATGCCAATG 2800 
KKLWERYTHRRKEVSEENHNHAN 

AACGAATGCTATTTCATGGGTCTCCTTTTGTGAATGCAATTATCCACAAAGGCTTTGATGAAAGGCATGC 2870 
ERMLFHGSPFVNAIIHKGFDERHA 

GTACATAGGTGGTATGTTTGGAGCTGGCATTTATTTTGCTGAAAACTCTTCCAAAAGCAATCAATATGTA 2940 
YIGGMFGAGIYFAENSSKSNQYV 

TATGGAATTGGAGGAGGTACTGGGTGTCCAGTTCACAAAGACAGATCTTGTTACATTTGCCACAGGCAGC 3010 
YGIGGGTGCPVHKDRSCYICHRQ 

TGCTCTTTTGCCGGGTAACCTTGGGAAAGTCTTTCCTGCAGTTCAGTGCAATGAAAATGGCACATTCTCC 3080 
LLFCRVTLGKSFLQFSAMKMAHSP 

TCCAGGTCATCACTCAGTCACTGGTAGGCCCAGTGTAAATGGCCTAGCATTAGCTGAATATGTTATTTAC 3150 
PGHHSVTGRPSVNGLALAEYVIY 

AGAGGAGAACAGGCTTATCCTGAGTATTTAATTACTTACCAGATTATGAGGCCTGAAGGTATGGTCGATG 3220 
RGEQAYPEYLITYQIMRPEGMVD 

GATAAATAGTTATTTTAAGAAACTAATTCCACTGAACCTAAAATCATCAAAGCAGCAGTGGCCTCTACGT 3290 
TTTACTCCTTTGCTGAAAAAAAATCATCTTGCCCACAGGCCTGTGGCAAAAGGATAAAAATGTGAACGAA 3360 
GTTTAACATTCTGACTTGATAAAGCTTTAATAATGTACAG 3400 
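
The pairing of nucleotide lines and one-letter translation lines in Figure 1 can be checked by translating each nucleotide line in reading frame. The Python sketch below is a minimal illustration, assuming the standard genetic code and a reading frame that starts at the first nucleotide of the figure; the function name `translate` is hypothetical, and the input is the first nucleotide line of Figure 1.

```python
from itertools import product

# Standard genetic code, with codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(nucleotides: str, frame: int = 0) -> str:
    """Translate complete codons only, starting at offset `frame` (hypothetical helper)."""
    seq = nucleotides.upper()[frame:]
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


# Nucleotides 1 to 70 as shown on the first line of Figure 1.
line_1 = "ATTCCTCTTCATAATGCATGCTCTTTTGGTCATGCTGAAGTAGTCAATCTCCTTTTGCGACATGGTGCAG"
print(translate(line_1))
```

Because each figure line holds 70 nucleotides, the frame offset shifts from line to line; concatenating the lines before translating avoids having to track that offset by hand.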